51 research outputs found
SQUASH: Simple QoS-Aware High-Performance Memory Scheduler for Heterogeneous Systems with Hardware Accelerators
Modern SoCs integrate multiple CPU cores and Hardware Accelerators (HWAs)
that share the same main memory system, causing interference among memory
requests from different agents. The result of this interference, if not
controlled well, is missed deadlines for HWAs and low CPU performance.
State-of-the-art mechanisms designed for CPU-GPU systems strive to meet a
target frame rate for GPUs by prioritizing the GPU close to the time when it
has to complete a frame. We observe two major problems when such an approach is
adapted to a heterogeneous CPU-HWA system. First, HWAs miss deadlines because
they are prioritized only close to their deadlines. Second, such an approach
does not consider the diverse memory access characteristics of different
applications running on CPUs and HWAs, leading to low performance for
latency-sensitive CPU applications and deadline misses for some HWAs, including
GPUs.
In this paper, we propose a Simple Quality of service Aware memory Scheduler
for Heterogeneous systems (SQUASH), that overcomes these problems using three
key ideas, with the goal of meeting deadlines of HWAs while providing high CPU
performance. First, SQUASH prioritizes a HWA when it is not on track to meet
its deadline any time during a deadline period. Second, SQUASH prioritizes HWAs
over memory-intensive CPU applications based on the observation that the
performance of memory-intensive applications is not sensitive to memory
latency. Third, SQUASH treats short-deadline HWAs differently as they are more
likely to miss their deadlines and schedules their requests based on worst-case
memory access time estimates.
Extensive evaluations across a wide variety of different workloads and
systems show that SQUASH achieves significantly better CPU performance than the
best previous scheduler while always meeting the deadlines for all HWAs,
including GPUs, thereby largely improving frame rates
The Blacklisting Memory Scheduler: Balancing Performance, Fairness and Complexity
In a multicore system, applications running on different cores interfere at
main memory. This inter-application interference degrades overall system
performance and unfairly slows down applications. Prior works have developed
application-aware memory schedulers to tackle this problem. State-of-the-art
application-aware memory schedulers prioritize requests of applications that
are vulnerable to interference, by ranking individual applications based on
their memory access characteristics and enforcing a total rank order.
In this paper, we observe that state-of-the-art application-aware memory
schedulers have two major shortcomings. First, such schedulers trade off
hardware complexity in order to achieve high performance or fairness, since
ranking applications with a total order leads to high hardware complexity.
Second, ranking can unfairly slow down applications that are at the bottom of
the ranking stack. To overcome these shortcomings, we propose the Blacklisting
Memory Scheduler (BLISS), which achieves high system performance and fairness
while incurring low hardware complexity, based on two observations. First, we
find that, to mitigate interference, it is sufficient to separate applications
into only two groups. Second, we show that this grouping can be efficiently
performed by simply counting the number of consecutive requests served from
each application.
We evaluate BLISS across a wide variety of workloads/system configurations
and compare its performance and hardware complexity, with five state-of-the-art
memory schedulers. Our evaluations show that BLISS achieves 5% better system
performance and 25% better fairness than the best-performing previous scheduler
while greatly reducing critical path latency and hardware area cost of the
memory scheduler (by 79% and 43%, respectively), thereby achieving a good
trade-off between performance, fairness and hardware complexity
The Blacklisting Memory Scheduler: Achieving high performance and fairness at low cost
Abstract—In a multicore system, applications running on different cores interfere at main memory. This inter-application interference degrades overall system performance and unfairly slows down applications. Prior works have developed application-aware memory request schedulers to tackle this problem. State-of-the-art application-aware memory request schedulers prioritize memory requests of applications that are vulnerable to interfer-ence, by ranking individual applications based on their memory access characteristics and enforcing a total rank order. In this paper, we observe that state-of-the-art application-aware memory schedulers have two major shortcomings. First, ranking applications individually with a total order based on memory access characteristics leads to high hardware cost and complexity. Second, ranking can unfairly slow down applications that are at the bottom of the ranking stack. To overcome thes
- …